Multiplicative versus additive noise in multi-state neural networks

D. Bollé, J. Busquets Blanco, T. Verbeiren
Instituut voor Theoretische Fysica, Katholieke Universiteit Leuven
Celestijnenlaan 200D, B-3001 Leuven, BELGIUM



Abstract 

The effects of a variable amount of random dilution of the synaptic couplings in 
Q-Ising multi-state neural networks with Hebbian learning are examined. A fraction
of the couplings is explicitly allowed to be anti-Hebbian. Random dilution represents 
the dying or pruning of synapses and, hence, a static disruption of the learning process 
which can be considered as a form of multiplicative noise in the learning rule. Both 
parallel and sequential updating of the neurons can be treated. Symmetric dilution in 
the statics of the network is studied using the mean-field theory approach of statistical 
mechanics. General dilution, including asymmetric pruning of the couplings, is exam- 
ined using the generating functional (path integral) approach of disordered systems. It 
is shown that random dilution acts as additive gaussian noise in the Hebbian learning 
rule with a mean zero and a variance depending on the connectivity of the network 
and on the symmetry. Furthermore, a scaling factor appears that essentially measures 
the average amount of anti-Hebbian couplings. 

Keywords: neural networks, multi-state neurons, stochastic dynamics, thermody- 
namics, dilution, additive noise, multiplicative noise. 



1 Introduction 

In general, artificial neural networks have been widely applied to memorize and retrieve 
information. During the last number of years there has been considerable interest in neural 
networks with multi-state neurons using the framework of statistical mechanics, which 
deals with large systems of stochastically interacting microscopic elements (see, e.g., [1] and 
references cited therein). In these models, the firing states of the neurons or their membrane 
potentials are the microscopic dynamical variables. Basically, compared to models with 
two-state (=binary) neurons, such models can function as associative memories for grey-toned
or colored patterns [2,3] and/or allow for a more complicated internal structure of
the recall process, e.g., a distinction between the exact location and the details of a picture
in pattern recognition and the analogous problem of retrieval focusing in the framework of
cognitive neuroscience [4], a combination of information retrieval based on skills and based
on specific facts or data [5,6].

Different types of multi-state neurons can be distinguished according to the symmetry 
of the interactions between the different states. Here we are primarily interested in the so- 
called Q-Ising neuron, the states of which can be represented by scalars, and the interaction 
between two neurons can then be written as a function of the product of these scalars. So, 
the Q-states of the neuron can be ordered like a ladder between a minimum and a maximum 
value, usually taken to be -1 and +1. Special cases are Q = 2, i.e., the well-known Hopfield
model [7] and Q = ∞, i.e., networks with analogue or graded response neurons [8-10].

In analogy to the Hopfield model, the multi-state neuron models we discuss here have 
their immediate counterpart in random magnetic systems (=spin-glasses) (cfr., e.g., [11] 
and [12]), but with couplings defined in terms of embedded patterns through a learning 
rule. Since one of the aims of these networks is to find back the embedded patterns as 
attractors of the recall process, they are also interesting from the point of view of dynamical 
systems. 

This close relation with spin-glass systems means that the methods and techniques used 
to study the latter have been successfully applied to these network models. In particular, 
it also means that concepts like temperature, fluctuations, disorder, noise, stochasticity, ...
play a crucial role. In the literature it is well-known (see, e.g., [13]) that noise can
have rather surprising and counterintuitive effects in the behavior of dynamical systems. 
It has been shown many times that noise can have a constructive rather than a destructive 
role. Relevant examples [13] of this fact are the phenomenon of stochastic resonance, noise 
induced ordering transitions, noise induced disordering phase transitions and an increase 
of the maximal information content with dilution in some neural network models [14] . In 
principle, these ordering effects seem to be related to the multiplicative character of the 
fluctuations, as compared to the disordering role of additive fluctuations. But things are
not so simple because there is an interplay between additive and multiplicative noise terms. 

Moreover, there are several types of internal fluctuations, e.g., thermal fluctuations
introduced through a random term, quite often assumed to be gaussian distributed with
zero mean and uncorrelated at different times. These internal fluctuations are described
by an additive noise term, i.e., a random term that does not depend on the variable under
consideration. This is not a necessary character of internal fluctuations. In some systems
these can also be described by a multiplicative noise which is coupled to the state of the
system. As a natural extension of the concept of internal fluctuations, external noise comprises
those fluctuations that are not of thermal origin.

In the Q-Ising neural network models we are considering here, the following noise can 
be characterized. First, the neurons are stochastic such that the analogue of temperature is 
introduced. This enables us, as hinted at already above, to use the techniques of statistical
mechanical mean-field theory, and ultimately to compute, e.g., the storage capacity of the
network [1]. The zero-temperature limit will always reduce our system to a deterministic
Hopfield or multi-state Q-Ising network. This stochastic behavior models the facts
that neurons fire with variable strength, that there are delays in synapses, that
there are random fluctuations in the release of transmitters, and so on. Briefly, we model this
internal noise by thermal fluctuations.

Secondly, in single pattern recall with many stored patterns (a finite fraction of the size of
the network) there is the generally nontrivial interference noise due to the other patterns.
This noise has been treated, e.g., by statistical neurodynamics [1, 15, 16] or functional 
integration methods [14,17-19]. 

Thirdly, in the case of the Hopfield model the Hebb learning rule has been generalized 
in [20] in order to bring the model closer to natural systems. In particular, two types of 
noise terms have been added. The first one, an additive external contribution which is 
independent of the learning algorithm, and assumed to be gaussian distributed, is relaxing 
the hypothesis that the entire synaptic efficacy is coming from the learning process. The 
second one, a random multiplicative factor of order O(1), represents a static disruption
of the learning process. An important example of the latter is random dilution of the
network by the pruning or dying of synapses, relaxing the unrealistic condition that every
neuron is connected to every other one. The effects of both these static fluctuations on 
the recall process in the Hopfield model have been estimated using equilibrium mean- 
field theory statistical mechanics. Technically speaking, the use of this method allows for 
symmetric dilution only, because the detailed balance principle, i.e., absence of microscopic 
probability currents in the stationary state, is needed to define an energy function. An 
additional remark is that non-linear updating of the synapses is allowed in that work. It 
has been shown that all these effects can be represented by an additive static gaussian noise 
in the learning rule and that the model is robust against the interference of this static noise. 

In this contribution we extend the work of Sompolinsky [20] in different directions. Since 
we use both replica mean-field theory equilibrium methods and non-equilibrium functional 
integration techniques, the assumption of symmetric couplings is not required such that 
we can treat all forms of dilution. Moreover, we allow a fraction of the couplings to be 
anti-Hebbian [21]. We can also have sequential and parallel updating of the neurons, and 
we examine the effect of these noise terms in Q-Ising networks. The main results are that 
for the forms of dilution we have examined the effects can be represented by additive noise 
in the learning rule and a scaling factor proportional to the average amount of anti-Hebbian 
couplings. The diluted networks are robust under these effects. 

The rest of this paper is organized as follows. In Section 2 we introduce the Q-Ising 
neural network model and the types of dilution we are interested in. Section 3 treats 
the statics of the model, where detailed balance requires, as we will explain, symmetric 
dilution. In Section 4 the dynamics of the network is studied allowing for a general form 
of dilution. Section 5 presents some concluding remarks. 

2 Q-Ising networks with variable dilution



Consider a neural network consisting of $N$ neurons which can take values $\sigma_i$, $i = 1, \ldots, N$,
from a discrete set $\mathcal{S} = \{-1 = s_1 < s_2 < \cdots < s_Q = +1\}$. The $p$ patterns to be stored
in this network are supposed to be a collection of independent and identically distributed
random variables (i.i.d.r.v.), $\{\xi_i^\mu \in \mathcal{S}\}$, $\mu = 1, \ldots, p$, with zero mean,
$\langle \xi_i^\mu \rangle = 0$, and variance $A = \langle (\xi_i^\mu)^2 \rangle$. The latter is a measure for the
activity of the patterns. We remark that for simplicity we have taken these variables $s_k$,
$k = 1, \ldots, Q$, equidistant and we have also taken the patterns and the neurons out of the same
set of variables, but this is no essential restriction. Given the configuration
$\boldsymbol{\sigma}_N(t) = \{\sigma_j(t)\}$, $j = 1, \ldots, N$, the local field in neuron $i$ equals

$$h_i(\boldsymbol{\sigma}_N(t)) = \sum_{j=1}^{N} J_{ij}(t)\, \sigma_j(t) \qquad (1)$$

with $J_{ij}$ the synaptic coupling from neuron $j$ to neuron $i$.

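All neurons are updated sequentially or in parallel through the spin-flip dynamics defined
by the transition probabilities

$$\Pr\{\sigma_i(t+1) = s_k \in \mathcal{S} \mid \boldsymbol{\sigma}_N(t)\} = \frac{\exp\left[-\beta\, \epsilon_i[s_k|\boldsymbol{\sigma}_N(t)]\right]}{\sum_{s \in \mathcal{S}} \exp\left[-\beta\, \epsilon_i[s|\boldsymbol{\sigma}_N(t)]\right]} . \qquad (2)$$

The configuration $\boldsymbol{\sigma}_N(t = 0)$ is chosen as input. Here the energy potential
$\epsilon_i[s|\boldsymbol{\sigma}_N(t)]$ is defined by

$$\epsilon_i[s|\boldsymbol{\sigma}_N(t)] = -\frac{1}{2}\left[h_i(\boldsymbol{\sigma}_N(t))\, s - b s^2\right] , \qquad (3)$$

where $b > 0$ is the gain parameter of the system. The zero temperature limit $T = \beta^{-1} \to 0$
of this dynamics is given by the updating rule

$$\sigma_i(t) \to \sigma_i(t+1) = s_k \quad \text{such that} \quad \min_{s \in \mathcal{S}} \epsilon_i[s|\boldsymbol{\sigma}_N(t)] = \epsilon_i[s_k|\boldsymbol{\sigma}_N(t)] . \qquad (4)$$

This updating rule (4) is equivalent to using a gain function $g_b(\cdot)$,

$$\sigma_i(t+1) = g_b(h_i(\boldsymbol{\sigma}_N(t))) ,$$

$$g_b(x) = \sum_{k=1}^{Q} s_k \left[\theta\left(b(s_{k+1} + s_k) - x\right) - \theta\left(b(s_k + s_{k-1}) - x\right)\right] \qquad (5)$$

with $s_0 = -\infty$ and $s_{Q+1} = +\infty$ and $\theta(\cdot)$ the Heaviside function. For finite $Q$, this gain
function $g_b(\cdot)$ looks like a staircase with $Q$ steps. The gain parameter $b$ controls the average
slope of $g_b(\cdot)$ and, hence, suppresses or enhances the role of the states around zero.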
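As an illustration of the single-neuron update, the following minimal Python sketch evaluates the energy potential (3), Gibbs-type transition probabilities over the Q states, and the zero-temperature rule (4). The function names (`states`, `energy`, `transition_probabilities`, `gain`) are hypothetical and chosen only for this example; the Boltzmann form of the probabilities is the standard one and is assumed here.

```python
import math


def states(Q):
    """Equidistant Q-Ising states s_1 = -1 < ... < s_Q = +1 (requires Q >= 2)."""
    return [-1.0 + 2.0 * k / (Q - 1) for k in range(Q)]


def energy(s, h, b):
    """Energy potential of eq. (3): -(1/2) * (h * s - b * s**2)."""
    return -0.5 * (h * s - b * s * s)


def transition_probabilities(h, b, beta, Q):
    """Gibbs probabilities over the Q states for local field h (assumed Boltzmann form)."""
    energies = [energy(s, h, b) for s in states(Q)]
    e_min = min(energies)  # subtract the minimum to avoid overflow at large beta
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def gain(h, b, Q):
    """Zero-temperature rule (4): pick the state with the lowest energy potential."""
    return min(states(Q), key=lambda s: energy(s, h, b))
```

For Q = 3 and b = 0.5 the steps of the staircase sit at h = -0.5 and h = +0.5, so a field of 0.2 maps to the state 0 and a field of 0.7 maps to +1; at large beta the probabilities concentrate on the same state.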

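It is clear that the $J_{ij}$ explicitly depend on the architecture. We are interested in
architectures with variable dilution and we also want to allow a fraction of the couplings
to be anti-Hebbian. We realize this by choosing the couplings according to the Hebb rule
multiplied with a factor $c_{ij}$

$$J^c_{ij} = \frac{c_{ij}}{c}\, J_{ij} , \qquad J_{ij} = \frac{1}{AN} \sum_{\mu=1}^{p} \xi_i^\mu \xi_j^\mu \qquad (6)$$

with the $\{c_{ij} = 0, \pm 1\}$, $i, j = 1, \ldots, N$, chosen to be i.i.d.r.v. and obeying, in general, a
distribution of the form

$$P[c_{ij} = x] = c_1 \delta_{x,1} + c_2 \delta_{x,-1} + (1 - c_1 - c_2)\, \delta_{x,0} \qquad (7)$$

with $c = c_1 + c_2 = \langle |c_{ij}| \rangle = \langle c_{ij}^2 \rangle$ the connectivity, i.e., the average number of
connections per neuron is given by $cN$. In order to allow for variable symmetry as well, we define a
joint-probability distribution for $i < j$ ($c_{ii} = 1$)

$$
\begin{aligned}
P[(c_{ij}, c_{ji}) = (x, y)] ={}& \left(c_1^2 + \frac{u + 2v + w}{4}\right)\delta_{x,1}\delta_{y,1} + \left(c_2^2 + \frac{u - 2v + w}{4}\right)\delta_{x,-1}\delta_{y,-1} \\
&+ \left(c_1(1 - c_1 - c_2) - \frac{v + w}{2}\right)\left(\delta_{x,1}\delta_{y,0} + \delta_{x,0}\delta_{y,1}\right) \\
&+ \left(c_2(1 - c_1 - c_2) + \frac{v - w}{2}\right)\left(\delta_{x,0}\delta_{y,-1} + \delta_{x,-1}\delta_{y,0}\right) \\
&+ \left(c_1 c_2 - \frac{u - w}{4}\right)\left(\delta_{x,1}\delta_{y,-1} + \delta_{x,-1}\delta_{y,1}\right) \\
&+ \left((1 - c_1 - c_2)^2 + w\right)\delta_{x,0}\delta_{y,0} \qquad (8)
\end{aligned}
$$

with

$$
\begin{aligned}
u &= \langle c_{ij} c_{ji} \rangle - \langle c_{ij} \rangle \langle c_{ji} \rangle = \langle c_{ij} c_{ji} \rangle - (c_1 - c_2)^2 \\
v &= \langle c_{ij}^2 c_{ji} \rangle - \langle c_{ij}^2 \rangle \langle c_{ji} \rangle = \langle c_{ij}^2 c_{ji} \rangle - c\,(c_1 - c_2) \\
w &= \langle c_{ij}^2 c_{ji}^2 \rangle - \langle c_{ij}^2 \rangle \langle c_{ji}^2 \rangle = \langle c_{ij}^2 c_{ji}^2 \rangle - c^2 . \qquad (9)
\end{aligned}
$$

We note that these expressions are symmetric under the change $i \leftrightarrow j$.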
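To make the construction of the diluted couplings concrete, here is a minimal Python sketch that draws patterns, samples the dilution variables from the single-site distribution (7), and builds Hebbian couplings multiplied by c_ij / c. All function names are hypothetical. Two simplifying assumptions are made for illustration only: the diagonal couplings are set to zero, and in the non-symmetric branch c_ij and c_ji are drawn independently (which corresponds to u = v = w = 0 rather than to the general joint distribution).

```python
import random


def sample_patterns(N, p, Q, rng):
    """p i.i.d. patterns with entries drawn uniformly from the Q equidistant states."""
    S = [-1.0 + 2.0 * k / (Q - 1) for k in range(Q)]
    return [[rng.choice(S) for _ in range(N)] for _ in range(p)]


def sample_c(c1, c2, rng):
    """One dilution variable from eq. (7): +1 w.p. c1, -1 w.p. c2, 0 otherwise."""
    r = rng.random()
    if r < c1:
        return 1
    if r < c1 + c2:
        return -1
    return 0


def diluted_couplings(xi, A, c1, c2, rng, symmetric=True):
    """Hebbian couplings multiplied by c_ij / c; diagonal set to zero (assumption)."""
    N = len(xi[0])
    c = c1 + c2
    J = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            hebb = sum(x[i] * x[j] for x in xi) / (A * N)
            c_ij = sample_c(c1, c2, rng)
            # symmetric dilution reuses the same variable for both directions
            c_ji = c_ij if symmetric else sample_c(c1, c2, rng)
            J[i][j] = c_ij * hebb / c
            J[j][i] = c_ji * hebb / c
    return J


def local_fields(J, sigma):
    """Local field h_i = sum_j J_ij sigma_j of eq. (1)."""
    return [sum(J_ij * s_j for J_ij, s_j in zip(row, sigma)) for row in J]
```

With c1 = 1 and c2 = 0 every c_ij equals 1 and the sketch reduces to the fully connected Hebb rule.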

These distributions generalize the following cases of random dilution frequently discussed
in the literature (see, e.g., [1])

• Symmetric dilution where $c_{ij} = c_{ji}$ (SD): Due to the symmetry, $u = c - (c_1 - c_2)^2$,
$v = (c_1 - c_2)(1 - c_1 - c_2)$ and $w = c(1 - c)$, yielding for eq. (8)

$$P[(c_{ij}, c_{ji}) = (x, y)] = c_1 \delta_{x,1}\delta_{y,1} + c_2 \delta_{x,-1}\delta_{y,-1} + (1 - c)\, \delta_{x,0}\delta_{y,0} \qquad (10)$$

whereby in most cases $c_2$ is taken to be zero, indicating that there are no anti-Hebbian
couplings mixed in explicitly.

• Asymmetric dilution with $c_{ij} \neq c_{ji}$ and $c_2 = 0$ (AD): In this case $u = v = w =
\langle c_{ij} c_{ji} \rangle - c^2$ and the joint-probability distribution eq. (8) becomes

$$P[(c_{ij}, c_{ji}) = (x, y)] = (u + c^2)\, \delta_{x,1}\delta_{y,1} + (c(1 - c) - u)\left(\delta_{x,1}\delta_{y,0} + \delta_{x,0}\delta_{y,1}\right) + (1 + u - 2c + c^2)\, \delta_{x,0}\delta_{y,0} . \qquad (11)$$
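The relation between the joint distribution and the parameters u, v, w can be checked numerically. The sketch below (hypothetical helper names) tabulates the nine probabilities from c1, c2, u, v, w, using the weights that follow from the marginal constraint (7) and the definitions (9), and then recomputes the moments from the table. It is meant as a consistency check under those assumptions, not as part of the original derivation.

```python
def joint_distribution(c1, c2, u, v, w):
    """Probabilities of the pair (c_ij, c_ji), keyed by (x, y), built from (7) and (9)."""
    c = c1 + c2
    P = {
        (1, 1): c1 * c1 + (u + 2 * v + w) / 4,
        (-1, -1): c2 * c2 + (u - 2 * v + w) / 4,
        (1, 0): c1 * (1 - c) - (v + w) / 2,
        (0, 1): c1 * (1 - c) - (v + w) / 2,
        (0, -1): c2 * (1 - c) + (v - w) / 2,
        (-1, 0): c2 * (1 - c) + (v - w) / 2,
        (1, -1): c1 * c2 - (u - w) / 4,
        (-1, 1): c1 * c2 - (u - w) / 4,
        (0, 0): (1 - c) ** 2 + w,
    }
    if min(P.values()) < -1e-12:
        raise ValueError("parameters do not define a probability distribution")
    return P


def moments(P):
    """Return (<c_ij>, <c_ij^2>, u, v, w) recomputed from the table P."""
    mean = sum(p * x for (x, y), p in P.items())
    second = sum(p * x * x for (x, y), p in P.items())
    u = sum(p * x * y for (x, y), p in P.items()) - mean ** 2
    v = sum(p * x * x * y for (x, y), p in P.items()) - second * mean
    w = sum(p * x * x * y * y for (x, y), p in P.items()) - second ** 2
    return mean, second, u, v, w
```

For symmetric dilution with c1 = 0.3 and c2 = 0.1, inserting u = c - (c1 - c2)^2, v = (c1 - c2)(1 - c) and w = c(1 - c) leaves weight only on the three diagonal entries, in agreement with eq. (10).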

The meaning of the variable dilution introduced above can best be understood
from the theory of random graphs with $N$ nodes and $p$ the probability that any two of
them are connected (see, e.g., [22]). The number of connections $n$ a node has for $N \to \infty$
(i.e., in the thermodynamic limit) goes to a Poisson distribution

$$P(n) = e^{-\langle n \rangle}\, \frac{\langle n \rangle^n}{n!} \qquad (12)$$

telling us that $\langle n \rangle = pN$ is the average number of connections per node. In order to
indicate what range of dilution we allow for, we look at the diameter of the random graph,
$d$, i.e., the maximum distance between any pair of nodes. This diameter is concentrated
around

$$d = \frac{\log(N)}{\log(\langle n \rangle)} = \frac{\log(N)}{\log(pN)} . \qquad (13)$$

The diameter is clearly 1 in the case that there is an edge with probability 1 and, hence,
we have a fully connected graph. The diameter diverges when $p \sim 1/N$. Given that $p$
scales as a power of $N$ with exponent $z$, several regimes containing different types of subgraphs
can be distinguished as a function of $z$ [22]. In particular, it can be shown that the precise point
above which the system becomes completely connected (such that it is always possible to
find a path between any two nodes) is $\langle n \rangle > \log(N)$.
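The two random-graph quantities used in this argument are easy to evaluate directly. The following short Python sketch (hypothetical function names) computes the Poisson degree distribution and the typical diameter log(N)/log(pN); returning infinity when the mean degree does not exceed one is a convention chosen here for the example.

```python
import math


def poisson_pmf(n, mean_degree):
    """Poisson probability of eq. (12) for a node to have n connections."""
    return math.exp(-mean_degree) * mean_degree ** n / math.factorial(n)


def diameter_estimate(N, p):
    """Typical diameter of eq. (13), log N / log(p N); infinite if p N <= 1 (convention)."""
    mean_degree = p * N
    if mean_degree <= 1.0:
        return math.inf
    return math.log(N) / math.log(mean_degree)
```

For p = 1 the estimate gives exactly 1, while at the connectivity threshold p = log(N)/N it grows as log(N)/log(log(N)), which is about 5.3 for N = 10^6.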

Looking at eq. (7) we find that this describes precisely a random graph with $c = c_1 + c_2$
being the probability to have an edge. It follows that the average number of connections per
neuron is $cN$. Since our Q-Ising network is taken to be a mean-field system characterized
by an extensive number of long-range interactions we need to have that $\langle n \rangle = cN$ tends to
infinity for $N \to \infty$ and all $c$. This implies that

$$P(|c_{ij}| = 1) = \frac{\log(N)}{N} + c \qquad (14)$$

such that $cN = \log(N)$ for $c = 0$ and the complete connectivity of our model is still
guaranteed. In this way, the diameter of our network is $d = 1$ for all $c > 0$ and $d = \infty$ when
$c = 0$, and the average number of connections is given by $\langle n \rangle = cN + \log(N)$. The limit
$c \to 0$, i.e., the so-called extremely diluted limit, has now a simple interpretation: each
neuron has an infinite number of neighboring neurons but such that the average distance
between any two neurons tends to infinity. All this is graphically illustrated in figure 1.

Figure 1: Effect of dilution. Top left: $c = 1$ (fully connected); top right: $0 < c < 1$ (some
couplings are cut); bottom left: $c = 0$ (extreme dilution, tree-like structures); and bottom
right: not all sites are connected (= finite connectivity).

We remark that in this way we can understand that eq. (6) has a well-defined meaning.
There is a factor $cN$ in the denominator which always tends to infinity in the thermodynamic
limit $N \to \infty$, whatever the value of $c$. Finally, we note that a distribution for
the $c_{ij}$ can be chosen that does not support complete connectivity (see figure 1 bottom
right), e.g., by taking $c = \tilde{c}/N$, with $\tilde{c}$ a fixed number independent of $N$. The system then
consists of disjoint clusters of different sizes (=finite connectivity) [23]. These models are 
much more difficult to handle and are outside the scope of the present work. 

3 Symmetric dilution and statics of the Q-Ising model 

It is well-known that networks with symmetric couplings $J_{ij}$ obey the detailed balance
principle. Systems with detailed balance can be described by standard equilibrium statistical
mechanics making use of a Hamiltonian. The Q-Ising neural network we have defined in
Section 2 is of a mean-field type with couplings (6) of infinite range and, restricting ourselves
to symmetric dilution $c_{ij} = c_{ji}$ (cfr. eq. (10)), the couplings $J^c_{ij}$ remain symmetric. The
long time behavior of this network for sequential updating of the neurons is then governed
by the following Hamiltonian

$$H = -\frac{1}{2} \sum_{i \neq j} J^c_{ij}\, \sigma_i \sigma_j + b \sum_i \sigma_i^2 . \qquad (15)$$

For parallel updating of the neurons a Hamiltonian is defined in terms of a two-spin
representation [24] as

$$H = -\sum_{i \neq j} J^c_{ij}\, \sigma_i \tau_j + b \sum_i \left(\sigma_i^2 + \tau_i^2\right) . \qquad (16)$$
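The thermodynamic properties of the system are then determined from the free energy
using the standard techniques of replica mean-field theory [25,26]. It is outside the scope
of this contribution to go into the very details of these techniques but we indicate how the
dilution is going to affect the calculations and the results.

The free energy is given as the logarithm of the partition function averaged over the
disorder. This average is done using the replica technique such that we can write

$$\langle\langle \ln Z \rangle\rangle = \lim_{n \to 0} \frac{1}{n}\left(\langle\langle Z^n \rangle\rangle - 1\right) , \qquad Z^n = \mathrm{Tr}_{\sigma^1, \ldots, \sigma^n} \exp\left[-\beta \sum_{a=1}^{n} H(\xi, \sigma^a)\right] \qquad (17)$$

where $H(\xi, \sigma^a)$ is the Hamiltonian of each replica of the system given above and $a$ is the
replica index. Next, we first average this partition function over the dilution. At this point
we remark that we consider the case of sequential updating. Parallel updating does not
contain any additional difficulties [14]. Furthermore, we do not write explicitly the term
proportional to $b$ in the Hamiltonian since it does not contain the $c_{ij}$ such that it can be
inserted at any point of the calculation. Due to the central limit theorem we can easily see
that $J^c_{ij} \sim O((cN)^{-1/2})$, so that we can expand

$$Z^n = \mathrm{Tr}_{\sigma} \exp\left[\beta \sum_{i<j} J^c_{ij} \sum_{a} \sigma_i^a \sigma_j^a\right] . \qquad (18)$$

The average over the dilution variables $c_{ij}$ with $i < j$, denoted by $\langle \cdot \rangle_c$, is then straightforward

$$
\begin{aligned}
\langle Z^n \rangle_c &= \mathrm{Tr}_{\sigma} \prod_{i<j} \left\langle \exp\left[\beta\, \frac{c_{ij}}{c}\, J_{ij} \sum_a \sigma_i^a \sigma_j^a\right] \right\rangle_c \\
&\simeq \mathrm{Tr}_{\sigma} \exp\left\{ \sum_{i<j} \left[ \beta\, \frac{c_1 - c_2}{c}\, J_{ij} \sum_a \sigma_i^a \sigma_j^a + \frac{\beta^2}{2}\, \frac{c - (c_1 - c_2)^2}{c^2}\, J_{ij}^2 \Big(\sum_a \sigma_i^a \sigma_j^a\Big)^2 \right] \right\} \qquad (19)
\end{aligned}
$$

where we have used that $J_{ij} \sim O(N^{-1/2})$. Finally, noting that $(J_{ij})^2 \to \alpha c / N$ to leading
order as $N \to \infty$, with $\alpha = p/(cN)$ the finite loading capacity, we write the replicated
partition function averaged over the dilution as

$$\langle Z^n \rangle_c = \mathrm{Tr}_{\sigma} \exp\left[ \frac{\beta}{2}\, \frac{c_1 - c_2}{c} \sum_{i \neq j} J_{ij} \sum_a \sigma_i^a \sigma_j^a + \frac{\beta^2 \alpha}{4N}\, \frac{c - (c_1 - c_2)^2}{c} \sum_{i \neq j} \Big(\sum_a \sigma_i^a \sigma_j^a\Big)^2 \right] . \qquad (20)$$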
Using a Hubbard-Stratonovich transformation this can be further expressed as

$$\langle Z^n \rangle_c = \left\langle \mathrm{Tr}_{\sigma} \exp\left[ \frac{\beta}{2} \sum_{i \neq j} \left( \frac{c_1 - c_2}{c}\, J_{ij} + d_{ij} \right) \sum_a \sigma_i^a \sigma_j^a \right] \right\rangle_d \qquad (21)$$

with $\langle \cdot \rangle_d$ indicating the average over the $d_{ij}$, a set of i.i.d.r.v. for $i < j$ and symmetric
with $d_{ii} = 0$. They obey a gaussian distribution with mean $\langle d_{ij} \rangle = 0$ and variance
$\langle d_{ij}^2 \rangle = \alpha s/(cN)$ with $s = c - (c_1 - c_2)^2$. This shows that the symmetric dilution (10), being a
form of multiplicative noise, introduces an effective Hamiltonian where the learning rule
now contains an additive noise term plus a scaling of the Hebbian part

$$J^c_{ij} = \frac{c_1 - c_2}{c}\, J_{ij} + d_{ij} . \qquad (22)$$

The scaling factor expresses the influence of explicitly allowing an average amount $c_2 =
P[c_{ij} = -1]$ of anti-Hebbian couplings. For $c_2 = 0$ the scaling term is 1 and in this case the
expression (22) agrees with the results of Sompolinsky [20] for the Hopfield model $Q = 2$
and with the results of Theumann and Erichsen [27] for the symmetrically diluted Q-Ising
model.

An analogous calculation can be performed for parallel updating of the neurons leading
precisely to eq. (21) with $\sigma_j^a$ replaced by $\tau_j^a$ and with the factor 1/2 removed. In all
these calculations we have only used the first and second moments of the probability
distribution for the $c_{ij}$. This is due to the mean-field character of the network we have
treated. Therefore, one can easily extend this result to any (symmetric) multiplicative
noise such that

$$J^m_{ij} = \frac{\eta_{ij}}{\eta}\, J_{ij} \;\to\; J^a_{ij} = \frac{\langle \eta_{ij} \rangle}{\eta}\, J_{ij} + \sqrt{\frac{\alpha s}{\eta N}}\; d'_{ij} \qquad (23)$$

with obvious meaning of the superscripts $m$ and $a$, with $d'_{ij} = \mathcal{N}(0, 1)$ a gaussian with mean
zero and variance 1, and with $\eta = \langle \eta_{ij}^2 \rangle$ and $s = \eta - \langle \eta_{ij} \rangle^2$ characterizing the multiplicative
noise.
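The statement that only the first two moments of the dilution variables matter can be illustrated with a small Monte Carlo experiment. The sketch below (hypothetical function names, fixed value of the undiluted coupling J) samples the multiplicative factor c_ij / c and compares the empirical mean and variance of the diluted coupling with the scaling factor (c1 - c2)/c and the noise variance s J^2 / c^2, which becomes alpha s / (cN) once J^2 is replaced by alpha c / N.

```python
import random
import statistics


def sample_c(c1, c2, rng):
    """One dilution variable: +1 w.p. c1, -1 w.p. c2, 0 otherwise."""
    r = rng.random()
    if r < c1:
        return 1
    if r < c1 + c2:
        return -1
    return 0


def multiplicative_moments(c1, c2, J, n_samples, rng):
    """Empirical mean and variance of the diluted coupling (c_ij / c) * J."""
    c = c1 + c2
    values = [sample_c(c1, c2, rng) * J / c for _ in range(n_samples)]
    return statistics.fmean(values), statistics.pvariance(values)


def additive_parameters(c1, c2, J):
    """Scaled Hebbian part and noise variance of the equivalent additive form."""
    c = c1 + c2
    s = c - (c1 - c2) ** 2
    return (c1 - c2) / c * J, s * J * J / c ** 2
```

For c1 = 0.3, c2 = 0.1 and J = 0.05 the additive description predicts a mean of 0.025 and a variance of 0.005625, and the sampled moments should agree with these values up to statistical error.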

To close this Section we remark that when we are interested in the further calculation 
of the free energy, e.g., in order to obtain the equilibrium fixed-point equations and the 
loading capacity we have to average the partition function (21) over the pattern distribution
employing the standard techniques. We refer to the literature for the final results [14,27]. 



4 General dilution and dynamics of the Q-Ising model 

For asymmetric dilution (recall, e.g., eq. (11)) the system we have defined does not obey
detailed balance. In this case we have to resort to techniques used in non-equilibrium 
statistical mechanics. The method we use to study the effect of general dilution in the 
dynamics is the generating functional approach [18,19]. 
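The idea of this approach is to look at the probability to find a certain microscopic path
in time. The basic tool to study the statistics of these paths is the generating function

$$Z[\psi] = \left\langle\left\langle \sum_{\sigma(0)} \cdots \sum_{\sigma(t)} P[\sigma(0) \ldots \sigma(t)]\, \exp\left(-\mathrm{i} \sum_{s=0}^{t} \sum_{i=1}^{N} \psi_i(s)\, \sigma_i(s)\right) \right\rangle_c\right\rangle_\xi \qquad (24)$$

where the dilution and pattern averages are denoted by $\langle\langle \cdot \rangle_c\rangle_\xi$ and where $P[\sigma(0) \ldots \sigma(t)]$
is the probability to have a certain path in phase space

$$P[\sigma(0) \ldots \sigma(t)] = P[\sigma(0)] \prod_{t'=1}^{t} W[\sigma(t')|\sigma(t'-1)] \qquad (25)$$

with $W[\sigma(t')|\sigma(t'-1)]$ the transition probabilities from $\sigma(t'-1)$ to $\sigma(t')$. For the Q-Ising
network with parallel updating they read

$$W[\sigma(t')|\sigma(t'-1)] = \prod_{i=1}^{N} \frac{\exp\left(\beta \sigma_i(t') \sum_j J^c_{ij}\, \sigma_j(t'-1) - \beta b\, \sigma_i^2(t')\right)}{\mathrm{Tr}_{\sigma} \exp\left(\beta \sigma \sum_j J^c_{ij}\, \sigma_j(t'-1) - \beta b\, \sigma^2\right)} . \qquad (26)$$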

We remark again that sequential updating can be treated in an analogous way. Furthermore,
we note that one can obtain all the order parameters of the system through
differentiation of the generating function, e.g., the overlap between the network configuration
and an embedded pattern is given by

$$m^\mu(s) = \mathrm{i} \lim_{\psi \to 0} \frac{1}{AN} \sum_{i} \xi_i^\mu\, \frac{\partial Z[\psi]}{\partial \psi_i(s)} . \qquad (27)$$

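Introducing the local fields $h = \{h_i(s) = \sum_j J^c_{ij}\, \sigma_j(s)\}$ and their conjugates $\hat{h}$, we arrive at

$$
\begin{aligned}
Z[\psi] = \int \{dh\}\{d\hat{h}\} \sum_{\sigma(0)} \cdots \sum_{\sigma(t)} & P[\sigma(0)] \prod_{s>0} \prod_{i} W[\sigma_i(s)|h_i(s-1)] \\
& \times \exp\left(N \mathcal{F}[\sigma, \hat{h}]\right) \prod_{i} \prod_{s=0}^{t} \exp\left(\mathrm{i}\, \hat{h}_i(s) h_i(s) - \mathrm{i}\, \psi_i(s) \sigma_i(s)\right) \qquad (28)
\end{aligned}
$$

where the disorder and dilution are put in one term

$$\mathcal{F}[\sigma, \hat{h}] = \frac{1}{N} \ln \left\langle\left\langle \exp\left[-\mathrm{i} \sum_{i} \sum_{s=0}^{t} \hat{h}_i(s) \sum_{j} J^c_{ij}\, \sigma_j(s)\right] \right\rangle_c\right\rangle_\xi . \qquad (29)$$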

We do not report the whole treatment of the generating function here but refer to, e.g., [19] 
for more details on the method. We do discuss in detail, however, the dilution average since 
this is precisely the subject of our study. 
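As in the statics we use that the diluted couplings $J^c_{ij}$ are of order $O((cN)^{-1/2})$ and
that the $J_{ij} \sim O(N^{-1/2})$. We note that, in principle, diagonal coupling terms usually taken
to be of the form $J_{ii} = \alpha J_0$ can be present, but they are taken out separately because they
do not need to be averaged over. Introducing then $b_{ij} = \sum_s \hat{h}_i(s)\, \sigma_j(s)$ and using the fact
that the distribution for the asymmetric dilution eq. (8) is i.i.d.r.v. for $i < j$ we can write

$$
\begin{aligned}
\prod_{i<j} \left\langle \exp\left[-\mathrm{i}\left(J^c_{ij} b_{ij} + J^c_{ji} b_{ji}\right)\right] \right\rangle_c
&= \prod_{i<j} \left\langle 1 - \mathrm{i}\, \frac{J_{ij}}{c}\left(c_{ij} b_{ij} + c_{ji} b_{ji}\right) - \frac{J_{ij}^2}{2c^2}\left(c_{ij} b_{ij} + c_{ji} b_{ji}\right)^2 + O((cN)^{-3/2}) \right\rangle_c \\
&\simeq \prod_{i<j} \exp\left\{ -\mathrm{i}\, \frac{c_1 - c_2}{c}\, J_{ij}\left(b_{ij} + b_{ji}\right) - \frac{J_{ij}^2}{2c^2}\left[ s\left(b_{ij} + b_{ji}\right)^2 + 2(u - s)\, b_{ij} b_{ji} \right] \right\} \\
&= \prod_{i \neq j} \exp\left\{ -\mathrm{i}\, \frac{c_1 - c_2}{c}\, J_{ij}\, b_{ij} - \frac{s J_{ij}^2}{2c^2}\left(w_+ b_{ij} + w_- b_{ji}\right)^2 \right\} \qquad (30)
\end{aligned}
$$

where we have introduced a number of shorthand notations

$$s = c - (c_1 - c_2)^2 , \qquad w_\pm = \frac{1}{2}\left(\sqrt{1 + \Gamma} \pm \sqrt{1 - \Gamma}\right) , \qquad \Gamma = \frac{u}{s} = \frac{\langle c_{ij} c_{ji} \rangle - \langle c_{ij} \rangle^2}{\langle c_{ij}^2 \rangle - \langle c_{ij} \rangle^2} = \frac{\langle c_{ij} c_{ji} \rangle - (c_1 - c_2)^2}{c - (c_1 - c_2)^2} . \qquad (31)$$

We remark that the parameter $\Gamma$ is a measure for the symmetry in the $c_{ij}$, and takes
values in the range $[-1, 1]$; $\Gamma = 1$ is complete symmetry, $\Gamma = -1$ is complete antisymmetry.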
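The role of the symmetry parameter can be made tangible by computing it for a few simple pair distributions. The sketch below (hypothetical function name) takes a table of probabilities for the pair (c_ij, c_ji) and returns the ratio of the covariance u to the variance s; the three example tables used afterwards are illustrative choices, not taken from the paper.

```python
def symmetry_parameter(P):
    """Gamma = u / s for a table P mapping (c_ij, c_ji) to its probability."""
    mean = sum(p * x for (x, y), p in P.items())
    second = sum(p * x * x for (x, y), p in P.items())
    cross = sum(p * x * y for (x, y), p in P.items())
    s = second - mean ** 2   # variance of c_ij
    u = cross - mean ** 2    # covariance of c_ij and c_ji
    return u / s
```

A symmetric table such as {(1, 1): 0.3, (-1, -1): 0.1, (0, 0): 0.6} gives a value of 1, independently drawn c_ij and c_ji give 0, and a table with weight only on (1, -1) and (-1, 1) gives -1.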
Finally, a Hubbard-Stratonovich transformation can be done as before leading to 



$$\prod_{i \neq j} \left\langle \exp\left\{ -\mathrm{i}\left( \frac{c_1 - c_2}{c}\, J_{ij} + \left[w_+ d_{ij} + w_- d_{ji}\right] \right) b_{ij} \right\} \right\rangle_d \qquad (32)$$

where the $d_{ij}$ are a set of i.i.d.r.v. for $i < j$, symmetric, $d_{ij} = d_{ji}$, and obeying a gaussian
probability distribution with $\langle d_{ij} \rangle = 0$ and $\langle d_{ij}^2 \rangle = s J_{ij}^2 / c^2$. This shows that also general
asymmetric dilution can be written as additive noise in the learning rule with a scaling of
the Hebbian part

$$J^c_{ij} = \frac{c_1 - c_2}{c}\, J_{ij} + \sqrt{\frac{\alpha s}{cN}}\left(w_+ d'_{ij} + w_- d'_{ji}\right) \qquad (33)$$

where now $d'_{ij} = \mathcal{N}(0, 1)$. We note explicitly that the parameters $v$ and $w$ (recall eq. (9))
do not play a role in the calculation. Only the mean $c$, the variance $s$ and the covariance
$u$ of the probability distribution for the random dilution are important. Again, for $c_2 = 0$
the scaling factor expressing the explicit mixing in of anti-Hebbian couplings is 1 and
in the symmetric case (SD), $\Gamma = 1$ in addition, such that the additive noise reduces to
$J^c_{ij} = J_{ij} + d_{ij}$ (see eq. (22)).

As in the statics, the calculation of the dynamics can be pursued by doing the aver- 
age over the patterns and expressions for the overlap, correlation functions and response 
functions can be obtained. This is beyond the purpose of the present contribution and will 
be presented in [14]. Finally, the remarks made before that sequential dynamics can be 
treated similarly and that the effective dynamics is formally the same, boil down to the 
fact that the explicit form of the transition rates is not needed to derive the effective path 
average. Only the initial conditions need to factorize over the site index i and this is a 
characteristic property of mean-field systems.

Using the generating functional approach it is clear that the effect of random dilution 
on the learning rule in neural networks based on other types of multi-state neurons [1], e.g., 
Blume-Emery-Griffiths neurons, Potts neurons, Ashkin-Teller neurons, can be examined in
an analogous way. 

5 Concluding remarks 

In this work we have examined the effects of general random dilution, which can be con- 
sidered as a static disruption of the learning process and, hence, as a form of multiplicative 
noise in the Hebbian learning rule, on the statics and dynamics of Q-Ising multi-state 
neural networks. A fraction of the couplings is explicitly taken to be anti-Hebbian. Both 
sequential updating and parallel updating of the neurons are allowed. It is shown, using 
replica mean-field theory, that for symmetric dilution the effect on the learning rule ap- 
pears as additive gaussian noise together with a scaling of the Hebbian part. This scaling 
is a measure for the average amount of anti-Hebbian couplings and becomes 1 when no 
such couplings are present. This extends previous results in the literature. Moreover, for 
general dilution, including asymmetric forms, a similar result is obtained using the gener- 
ating functional approach employed in studies of the dynamics of disordered systems. The 
additive noise is determined as a function of the mean, the variance and the covariance of 
the probability distribution characterizing the dilution. We conjecture that this result is 
valid for any network of mean-field type. 

Although this is beyond the scope of the present work it is relevant to remark that it 
can be shown that the type of multi-state networks studied here are robust against the 
interference of static noise coming from random dilution (cfr., e.g., [14,20]) in the sense that 
the quality of the retrieval properties is affected very little, unless the amount of dilution 
is very high. As an illustration of this fact we show in fig. 2 the information content, 
being the product of the loading capacity and the mutual information, of a Q = 3-Ising 
neural network with parameters as indicated in the figure caption for several amounts of 
symmetric dilution c. For more details on this we refer to [14]. 

The fact that the effect of random dilution can be expressed as additive noise in the
learning rule makes the analytical calculations on these networks easier and more transparent
and can be of help in the non-trivial numerical simulations of diluted systems.

Figure 2: Information content i as a function of the loading capacity α for the Q = 3
Ising model with uniform patterns (A = 2/3), b = 0.5, T = 0, and symmetric dilution with
c2 = 0 and c1 = c = 0.01, 0.1 and 1.0.